# Clothing1MPP

## [Pinned] Please USE a Pull Request to submit your commits

[Pull Request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests)

## [Pinned] Before you create a PR, PLEASE do the following to check your code format

```bash {"id":"01J962526X6PC84ME095C5VVH3"}
cd [Your repo path]
isort --profile black .
black .
```

## [Pinned] Please add your testing results below BEFORE you create a PR

## Supported Algorithms

- [x] Vanilla Training
- [x] Co-teaching Training (https://github.com/bhanML/Co-teaching)
- [x] TaylorCE
- [x] LogitClip
- [x] Jocor

## Examples

### Vanilla Training

#### CIFAR

```bash {"id":"01J962526X6PC84ME097B1CP0Y"}
python examples/main_cifar.py --config configs/cifar10/default.yaml
```

#### Clothing1MPP

```bash {"id":"01J962526X6PC84ME09AJ5S0D2"}
python examples/main.py --config configs/Clothing1MPP/default.yaml
```

### Co-teaching

#### CIFAR

```bash {"id":"01J962526X6PC84ME09DWC3SPS"}
python examples/main_cifar_coteaching.py --config configs/cifar10/aggre-coteaching.yaml
```

## How to use the dataset

```python {"id":"01J962526X6PC84ME09FFFQVQ7"}
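from torch.utils.data import DataLoader, Subset

# Clothing1mPP is the dataset class provided by this repo; import it from its module.
# root, image_size, batch_size, and num_workers are expected to be defined by the caller.
train_set = Clothing1mPP(root, image_size, split="train")

# Build the tiny version of the training set from a seeded selection of sample ids.
tiny_set_ids = train_set.get_tiny_ids(seed=0)
tiny_train_set = Subset(train_set, tiny_set_ids)

# The val and test splits reuse the data package already held by train_set.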
val_set = Clothing1mPP(
    root, image_size, split="val", pre_load=train_set.data_package
)
test_set = Clothing1mPP(
    root, image_size, split="test", pre_load=train_set.data_package
)

train_loader = DataLoader(
    train_set, batch_size=batch_size, shuffle=True, num_workers=num_workers
)
tiny_train_loader = DataLoader(
    tiny_train_set, batch_size=batch_size, shuffle=True, num_workers=num_workers
)
val_loader = DataLoader(
    val_set, batch_size=batch_size, shuffle=False, num_workers=num_workers
)
test_loader = DataLoader(
    test_set, batch_size=batch_size, shuffle=False, num_workers=num_workers
)
```
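The tiny split above relies on a seed so that the same subset is selected on every run. The following is a minimal sketch of that idea using only the standard library. It is not the implementation of `get_tiny_ids`; the helper name `sample_subset_ids` and the `fraction` parameter are assumptions made for illustration.

```python
import random


def sample_subset_ids(all_ids, fraction=0.1, seed=0):
    """Return a reproducible, sorted sample of ids (hypothetical helper)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    k = max(1, int(len(all_ids) * fraction))
    return sorted(rng.sample(list(all_ids), k))


ids = list(range(1000))
tiny_a = sample_subset_ids(ids, fraction=0.05, seed=0)
tiny_b = sample_subset_ids(ids, fraction=0.05, seed=0)
print(len(tiny_a), tiny_a == tiny_b)  # the same seed selects the same subset
```

A list of ids produced this way could be passed to `Subset` in the same way as `tiny_set_ids`.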

## Training your Model

You can train your model on either the CIFAR-10 dataset or the Clothing1MPP dataset.

### Train on CIFAR-10

`python main.py --config=configs/cifar10/default.yaml`

### Train on Clothing1MPP

The data for this project can be found in this [Google Drive](https://drive.google.com/file/d/1U-NXvHfmUUqL1l5_PspBIcJwXrlxHZe4/view?usp=sharing) (a Baidu Drive link is coming soon). Download the Clothing1MPP dataset and set the `root` path in the config file to its location.

`python main.py --config=configs/clothing1mpp/default.yaml`

## Customizing the Training Process

Here's how you can customize different components to accommodate your method:

- __Training loop__: Subclass the `Trainer` class found at `train_loops/trainer_default.py`
- __Vision Backbones__: Refer to the examples in the `model_factory.py` file
- __Optimizer__: Refer to the examples in the `optimizer_factory.py` file
- __Loss function__: Refer to the examples in the `loss_function_factory.py` file
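
The factory files listed above are the place to plug in new components. As a rough illustration of the general pattern, here is a generic name-to-callable registry in plain Python. It is only a sketch: `LOSS_REGISTRY`, `register_loss`, and `build_loss` are hypothetical names, and the actual code in `loss_function_factory.py` may be organized differently.

```python
import math

# Hypothetical registry: maps a config string to a loss callable.
LOSS_REGISTRY = {}


def register_loss(name):
    """Decorator that stores a loss function under `name`."""
    def decorator(fn):
        LOSS_REGISTRY[name] = fn
        return fn
    return decorator


@register_loss("cross_entropy")
def cross_entropy(probs, target):
    """Negative log-probability of the target class for one sample."""
    return -math.log(probs[target])


def build_loss(name):
    """Look up a registered loss, failing loudly on unknown names."""
    try:
        return LOSS_REGISTRY[name]
    except KeyError:
        known = sorted(LOSS_REGISTRY)
        raise ValueError(f"Unknown loss {name!r}; known: {known}") from None


loss_fn = build_loss("cross_entropy")
print(loss_fn([0.1, 0.7, 0.2], target=1))
```

With this kind of layout, adding a method means registering one new function and selecting it by name in the config, without touching the training loop.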

## Benchmark

### CIFAR10-N-worst

|    Method   | Epoch | Model Arch | pre-trained weight |   lr   | batch size | Testing Acc. |
|:-----------:|:-----:|:----------:|:------------------:|:------:|:----------:|:------------:|
|      CE     |  100  |  ResNet-18 |          -         |  0.01  |     128    |     76.35    |
| Co-teaching |  100  |  ResNet-18 |          -         |  0.001 |     128    |     80.71    |
|  LogitClip  |  100  |  ResNet-18 |          -         |   0.1  |     128    |     83.79    |
|   TaylorCE  |  100  |  ResNet-18 |          -         | 0.0001 |     128    |     79.75    |
|    Jocor    |  100  |  ResNet-18 |          -         | 0.0001 |     128    |     79.20    |

### CIFAR10-N-random

|    Method   | Epoch | Model Arch | pre-trained weight |   lr   | batch size | Testing Acc. |
|:-----------:|:-----:|:----------:|:------------------:|:------:|:----------:|:------------:|
|      CE     |  100  |  ResNet-18 |          -         |  0.01  |     128    |     83.55    |
| Co-teaching |  100  |  ResNet-18 |          -         |  0.001 |     128    |     90.77    |
|  LogitClip  |  100  |  ResNet-18 |          -         |   0.1  |     128    |     90.79    |
|   TaylorCE  |  100  |  ResNet-18 |          -         | 0.0001 |     128    |     87.17    |
|    Jocor    |  100  |  ResNet-18 |          -         | 0.0001 |     128    |     88.96    |

### CIFAR10-N-aggre

|    Method   | Epoch | Model Arch | pre-trained weight |   lr   | batch size | Testing Acc. |
|:-----------:|:-----:|:----------:|:------------------:|:------:|:----------:|:------------:|
|      CE     |  100  |  ResNet-18 |          -         |  0.01  |     128    |     87.84    |
| Co-teaching |  100  |  ResNet-18 |          -         |  0.001 |     128    |     91.54    |
|  LogitClip  |  100  |  ResNet-18 |          -         |   0.1  |     128    |     91.14    |
|   TaylorCE  |  100  |  ResNet-18 |          -         | 0.0001 |     128    |     89.06    |
|    Jocor    |  100  |  ResNet-18 |          -         | 0.0001 |     128    |     88.47    |

### CIFAR100-N

|    Method   | Epoch | Model Arch | pre-trained weight |   lr   | batch size | Testing Acc. |
|:-----------:|:-----:|:----------:|:------------------:|:------:|:----------:|:------------:|
|      CE     |  100  |  ResNet-18 |          -         |  0.01  |     128    |     51.51    |
| Co-teaching |  100  |  ResNet-18 |          -         |  0.001 |     128    |     55.05    |
|  LogitClip  |  100  |  ResNet-18 |          -         |   0.1  |     128    |     62.58    |
|   TaylorCE  |  100  |  ResNet-18 |          -         | 0.0001 |     128    |     52.39    |
|    Jocor    |  100  |  ResNet-18 |          -         | 0.0001 |     128    |     54.75    |

### Clothing1MPP

<!-- |    Method   | Epoch | Model Arch | pre-trained weight |   lr  | batch size | Testing Acc. |
|:-----------:|:-----:|:----------:|:------------------:|:-----:|:----------:|:------------:|
|      CE     |  100  |  ResNet-18 |          -         |  0.01 |     128    |     76.35    |
| Co-teaching |  100  |  ResNet-18 |          -         | 0.001 |     128    |     80.71    |
|  LogitClip  |  100  |  ResNet-18 |          -         |  0.1  |     128    |              |
|   TaylorCE  |  100  |  ResNet-18 |          -         |       |            |              |
-->

|    Method   | Epoch |      Dataset     | Model Arch | pre-trained weight |   lr  | batch size |  Testing Acc. |
|:-----------:|:-----:|:----------------:|:----------:|:------------------:|:-----:|:----------:|:-------------:|
|      CE     |   20  |   Clothing1MPP   |  ResNet-18 |    ImageNet1K v2   |  0.01 |     64     |               |
|      CE     |   20  |   Clothing1MPP   |  ResNet-18 |    ImageNet1K v1   |  0.02 |     64     |     76.27     |
|      CE     |   20  |   Clothing1MPP   |  ResNet-34 |    ImageNet1K v1   |  0.02 |     64     |               |
|      CE     |   20  |   Clothing1MPP   |  ResNet-50 |    ImageNet1K v1   |  0.02 |     256    |               |
|      CE     |   20  |   Clothing1MPP   |  ResNet-50 |    ImageNet1K v2   |  0.02 |     256    |               |
|      CE     |   20  |   Clothing1MPP   |  ResNet-50 |    ImageNet1K v1   |  0.01 |     64     |               |
|      CE     |   20  |   Clothing1MPP   |  ResNet-50 |    ImageNet1K v2   |  0.01 |     64     |               |

## Clothing1MPP Dataset Details

### File structure of the dataset

```text {"id":"01J962526X6PC84ME09GJZKFA6"}
root/
├── splits_labels/
│   ├── labels_meta.json   # All labels of each sample and meta information
│   ├── test_ids.pt        # ids of the test set
│   ├── train_ids.pt       # ids of the train set
│   └── val_ids.pt         # ids of the validation set
├── images/
│   ├── Dress/
│   ├── Hoodie/
│   ├── Jacket/
│   ├── Knitwear/
│   ├── Shawl/
│   ├── Shirt/
│   ├── Suit/
│   ├── Sweater/
│   ├── T-shirt/
│   ├── Underwear/
│   ├── Vest/
│   └── Windbreaker/
```
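
Before pointing a config at a downloaded copy, it can help to confirm that the folder matches the tree above. The sketch below checks for the listed files and class folders using only the standard library. The function name `missing_dataset_paths` is a hypothetical helper and is not part of this repo.

```python
from pathlib import Path

# Names copied from the directory tree above.
SPLIT_FILES = ["labels_meta.json", "test_ids.pt", "train_ids.pt", "val_ids.pt"]
CLASS_DIRS = [
    "Dress", "Hoodie", "Jacket", "Knitwear", "Shawl", "Shirt",
    "Suit", "Sweater", "T-shirt", "Underwear", "Vest", "Windbreaker",
]


def missing_dataset_paths(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    expected = [root / "splits_labels" / name for name in SPLIT_FILES]
    expected += [root / "images" / name for name in CLASS_DIRS]
    return [str(path) for path in expected if not path.exists()]
```

Calling `missing_dataset_paths(root)` with the same `root` used in the config returns an empty list when the layout is complete.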

### Labels

The `labels_meta.json` file contains the class label and attributes of each sample, together with the attribute values available for each class.

```js {"id":"01J962526X6PC84ME09J07H37P"}
{
    "labels": [
        {
            "id": 65,
            "file_name": "65.s-l500.jpg",
            "file_path": "/Sweater/Black_Cotton_Cable knit/65.s-l500.jpg",
            "Labels": "Sweater",
            "attributes": {
                "Color": "Black",
                "Material": "Cotton",
                "Pattern": "Cable knit"
            }
        },
        {
            "id": 37,
            "file_name": "37.ralph-lauren-black-polo-cable-knit-turtleneck-sweater-product-0-288043173-normal.jpg",
            "file_path": "/Sweater/Black_Cotton_Cable knit/37.ralph-lauren-black-polo-cable-knit-turtleneck-sweater-product-0-288043173-normal.jpg",
            "Labels": "Sweater",
            "attributes": {
                "Color": "Black",
                "Material": "Cotton",
                "Pattern": "Cable knit"
            }
        },
        ...
    ],
    "meta_data": {
        "Sweater": {
            "Color": [
                "Black",
                "White",
                "Grey",
                ...
            ],
            "Material": [
                "Cotton",
                "Wool",
                "Cashmere",
                ...
            ],
            "Pattern": [
                "Cable knit",
                "Ribbed",
                "Fair Isle",
                ...
            ]
        },
        ...
    }
}
```
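
To make the structure above concrete, the sketch below counts samples per class and flags attribute values that do not appear under `meta_data` for that class. It assumes that `meta_data` is meant to list every valid attribute value, which the excerpt suggests but does not state. `summarize_labels` is a hypothetical helper, and the in-memory example stands in for the real file.

```python
import json
from collections import Counter


def summarize_labels(meta):
    """Count samples per class and collect attribute values missing from meta_data."""
    counts = Counter(entry["Labels"] for entry in meta["labels"])
    unknown = []
    for entry in meta["labels"]:
        allowed = meta["meta_data"].get(entry["Labels"], {})
        for key, value in entry["attributes"].items():
            if value not in allowed.get(key, []):
                unknown.append((entry["id"], key, value))
    return counts, unknown


# Tiny in-memory stand-in with the same shape as the excerpt above.
example = json.loads("""
{
  "labels": [
    {"id": 65, "file_name": "65.s-l500.jpg",
     "file_path": "/Sweater/Black_Cotton_Cable knit/65.s-l500.jpg",
     "Labels": "Sweater",
     "attributes": {"Color": "Black", "Material": "Cotton", "Pattern": "Cable knit"}}
  ],
  "meta_data": {
    "Sweater": {"Color": ["Black", "White", "Grey"],
                "Material": ["Cotton", "Wool", "Cashmere"],
                "Pattern": ["Cable knit", "Ribbed", "Fair Isle"]}
  }
}
""")
counts, unknown = summarize_labels(example)
print(counts, unknown)
```

For the real dataset, the dictionary would come from `json.load` on `splits_labels/labels_meta.json`.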

### Train, validation, and test splits

Each `XXX_ids.pt` file under `splits_labels/` stores the sample ids of one split:

- The train set: noisy labels; the noise rate has not yet been determined.
- The validation set: mostly clean.
- The test set: we strongly believe it is clean.
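
The split ids and the label entries can be combined to locate the images of a split. The sketch below shows one way to do that, under two assumptions that the source does not confirm: each `file_path` is relative to `images/` despite its leading slash, and the `.pt` files hold a PyTorch-serialized collection of ids. `split_image_paths` is a hypothetical helper.

```python
from pathlib import Path


def split_image_paths(root, labels, split_ids):
    """Map each id in `split_ids` to (image path, class label)."""
    by_id = {entry["id"]: entry for entry in labels}
    images_dir = Path(root) / "images"
    result = {}
    for sample_id in split_ids:
        entry = by_id[sample_id]
        # Assumption: file_path is relative to images/, despite its leading slash.
        relative = entry["file_path"].lstrip("/")
        result[sample_id] = (images_dir / relative, entry["Labels"])
    return result


labels = [
    {
        "id": 65,
        "file_path": "/Sweater/Black_Cotton_Cable knit/65.s-l500.jpg",
        "Labels": "Sweater",
    },
]
# In practice the ids would be read from a split file such as train_ids.pt
# (assumed loadable with torch.load); a literal list is used here instead.
paths = split_image_paths("root", labels, split_ids=[65])
print(paths[65])
```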

### Feature Merge History

- [2024.03.17] Co-teaching is supported
- [2024.04.10] TaylorCE, Jocor and LogitClip are supported

